Efficient Tree Mining Using Reverse Search

Authors

Tatsuya Asai

Hiroki Arimura

Takeaki Uno

Shin-ichi Nakano

Ken Satoh

Abstract

In this paper, we review our data mining algorithms for discovering frequent substructures in a large collection of semi-structured data, where both the patterns and the data are modeled by labeled trees. These algorithms, namely FREQT for mining frequent ordered trees and UNOT for mining frequent unordered trees, efficiently enumerate all frequent tree patterns without duplicates using reverse search, a general scheme for designing efficient algorithms for hard enumeration problems, and incrementally compute the occurrences of each pattern. We also discuss classes of structures to which reverse search is applicable, such as itemsets, sequential episodes, path trees, and graphs.

Correspondence: Tatsuya Asai, Department of Informatics, Kyushu University, 6-10-1 Hakozaki Higashi-ku, Fukuoka 812-8581, JAPAN. E-mail: [email protected] Phone: +81-92-642-2697, fax: +81-92-642-2698
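The duplicate-free enumeration mentioned in the abstract can be illustrated with a minimal sketch. It is not the FREQT implementation: it assumes ordered trees are encoded as preorder sequences of (depth, label) pairs, that the parent of a tree is obtained by deleting its rightmost leaf, and that children are produced by attaching one node on the rightmost branch. The names `parent`, `children`, `enumerate_trees`, and the `keep` predicate are hypothetical, and the incremental occurrence computation is left out; `keep` merely stands in for an anti-monotone frequency test.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Node = Tuple[int, str]    # (depth, label); the root has depth 0
Tree = Tuple[Node, ...]   # nodes listed in preorder


def parent(tree: Tree) -> Tree:
    """Reverse-search parent: drop the rightmost leaf (last preorder node)."""
    return tree[:-1]


def children(tree: Tree, labels: Sequence[str]) -> Iterator[Tree]:
    """Rightmost expansion: attach one new node on the rightmost branch.

    The new node may sit at any depth from 1 up to one level below the
    current rightmost leaf, so the result is again a valid preorder code.
    """
    last_depth = tree[-1][0]
    for depth in range(1, last_depth + 2):
        for label in labels:
            yield tree + ((depth, label),)


def enumerate_trees(
    labels: Sequence[str],
    max_size: int,
    keep: Callable[[Tree], bool] = lambda tree: True,
) -> Iterator[Tree]:
    """Yield every labeled ordered tree with at most max_size nodes once.

    `keep` is a placeholder for a frequency test (assumed anti-monotone):
    a rejected tree is not expanded, so its descendants are never visited.
    """
    stack: List[Tree] = [((0, label),) for label in labels]
    while stack:
        tree = stack.pop()
        if not keep(tree):
            continue
        yield tree
        if len(tree) < max_size:
            stack.extend(children(tree, labels))


if __name__ == "__main__":
    for t in enumerate_trees(["a", "b"], 2):
        print(t)
```

Because every tree with more than one node has exactly one parent, the search space forms a tree rooted at the single-node patterns, and a plain depth-first traversal visits each pattern once without storing previously seen patterns.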

Related articles

GTRACE-RS: Efficient Graph Sequence Mining using Reverse Search

The mining of frequent subgraphs from labeled graph data has been studied extensively. Furthermore, much attention has recently been paid to frequent pattern mining from graph sequences. A method, called GTRACE, has been proposed to mine frequent patterns from graph sequences under the assumption that changes in graphs are gradual. Although GTRACE mines the frequent patterns efficiently, it sti...

CHARM: An Efficient Algorithm for Closed Itemset Mining

The set of frequent closed itemsets uniquely determines the exact frequency of all itemsets, yet it can be orders of magnitude smaller than the set of all frequent itemsets. In this paper we present CHARM, an efficient algorithm for mining all frequent closed itemsets. It enumerates closed sets using a dual itemset-tidset search tree, using an efficient hybrid search that skips many levels. It ...

Efficient Mining of High Utility Sequential Patterns Over Data Streams

High utility sequential pattern mining has emerged as an important topic in data mining. Although several preliminary works have been conducted on this topic, the existing studies mainly focus on mining high utility sequential patterns (HUSPs) in static databases and do not consider the streaming data. Mining HUSPs over data streams is very desirable for many applications. However, addressing t...

BOSTER: An Efficient Algorithm for Mining Frequent Unordered Induced Subtrees

Extracting frequent subtrees from the tree structured data has important applications in Web mining. In this paper, we introduce a novel canonical form for rooted labelled unordered trees called the balanced-optimal-search canonical form (BOCF) that can handle the isomorphism problem efficiently. Using BOCF, we define a tree structure guided scheme based enumeration approach that systematically...

Survey of Efficient and Fast Nearest Neighbor Search For Spatial Query on Multidimensional Data

Spatial data mining is a special kind of data mining. Patterns, clusters, classifications, etc., can be derived from the big data available. Especially, nearest neighbor search approach with respect to a query point plays a key role in arriving at the final decision making. Like Computer Integrated Manufacturing, Facility Layout, Cellular Manufacturing, nearest neighbor search has been found se...

Publication date: 2003
